Skip to content

Add orientation-dependent caching checks and option to disable caching #4119

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
May 22, 2025

Conversation

nmnobre
Copy link
Member

@nmnobre nmnobre commented Mar 27, 2025

Hi,

It turns out the problem I initially thought was to do with tetrahedral refinement is not specifically related to our refinement procedures, it is just more likely to happen when we refine.

There are cases in which two consecutive elements are sufficiently similar to trigger and reuse cached shape functions when, in fact, the orientation of the second element doesn't match that of the first. For example, two parallel edges aligned with the y-axis on neighbouring elements might not have the same orientation because the second has got a O(1e-16) difference in the x-coordinates of its two vertices for example. This is bound to happen when we refine and the resulting coordinates can't be exactly represented numerically (in base 2).

I've added a case to Vector FE Ex. 3: you expect grid_size=5 refine=2 to be the same as grid_size=20 (our default), but unless we disable caching, that isn't the case.

Of course an alternative to this would be, just like we do for caching, to do fuzzy comparisons for orientations as well. Opinions?

Cheers,
Nuno

@roystgnr
Copy link
Member

Of course an alternative to this would be, just like we do for caching, to do fuzzy comparisons for orientations as well.

I don't see how this would work. There has to be some single point at which an orientation flips; it's not something that can be non-transitive like the similarity we use in caching. I'd love to find out how I'm wrong, though, particularly if it's a foundational sort of mistake on my part. At some point we need to come up with orientation rules that can work with mesh displacement, and I haven't thought of anything better than "use unique_id() when it's enabled".

My first thought was to start doing non-fuzzy exact comparisons for caching, but that won't work well: a cartesian grid which should cache cleanly would then start hitting false negatives with any rotation or even with most grid spacings.

My only other idea is that we could make the "fuzzy" tolerance controlled by the FE type. Using a nodal or discontinuous type, it specifies <= TOLERANCE*TOLERANCE, and you still get the maximum reasonable caching; using a type that depends on orientations for vector or high-order scalar shapes, it specifies <= 0, and you get safety.

@nmnobre
Copy link
Member Author

nmnobre commented Mar 27, 2025

There has to be some single point at which an orientation flips; it's not something that can be non-transitive like the similarity we use in caching.

So, similarity is a relation between two elements, and I agree it's non-transitive. The question that needs to be asked is: if two elements are similar, can we guarantee they have the same orientations. If two elements are similar, we know vectors (0) -> (m) and (0) -> (n), where m and n are vertices of some edge, in the two elements are similar, i.e. their difference has l1-norm <= tol. It's obviously the case this doesn't guarantee (m) -> (n), i.e. the edge, is similar by the same standard. But, by the triangle inequality, we know it's similar if we relax the constraint to <= 2*tol. So could we not define our lexicographic comparisons with this in mind? EDIT: this is probably irrelevant anyway, because we want to one day support displaced meshes (see below).

My first thought was to start doing non-fuzzy exact comparisons for caching, but that won't work well

Agreed.

At some point we need to come up with orientation rules that can work with mesh displacement

Since this is a worry, and I agree we'll need to use a unique node id, and since in that case our similarity test is not good enough, i.e. it's no guarantee the orientation is preserved, I'm tempted to go with:

My only other idea is that we could make the "fuzzy" tolerance controlled by the FE type. Using a nodal or discontinuous type, it specifies <= TOLERANCE*TOLERANCE, and you still get the maximum reasonable caching; using a type that depends on orientations for vector or high-order scalar shapes, it specifies <= 0, and you get safety.

If we decide to go with this, do we still keep --disable-caching or do I remove that?

EDIT: Or, we could instead define --enable-caching-for-orientation-dependent-elems (you get the idea) that the user could use to get extra performance once they know their mesh is amenable to caching...
Or, yet another option, for these orientation-dependent elements, we add an extra check on the caching, the second element needs to be similar AND preserve orientation.

@moosebuild
Copy link

moosebuild commented Mar 27, 2025

Job Coverage, step Generate coverage on 18e6aae wanted to post the following:

Coverage

f2fd00 #4119 18e6aa
Total Total +/- New
Rate 63.56% 63.57% +0.01% 100.00%
Hits 75603 75633 +30 40
Misses 43348 43340 -8 0

Diff coverage report

Full coverage report

This comment will be updated on new commits.

@roystgnr
Copy link
Member

If we decide to go with this, do we still keep --disable-caching or do I remove that?

I definitely like having an option to disable caching. It's kind of like the option to use higher than double precision: we should never expect to need it, but if we ever suspect we need it then it's invaluable to be able to easily test that suspicion, either so we can quickly find and fix the problem or so we can rule out that suspicion. I guess we would want it to be a configure option rather than a command line option if we discovered that the extra runtime test hurt performance, but if we do the runtime option right then I'd be so surprised if someone saw a performance hit from it that I wouldn't trust their numbers without testing it myself. (I'll add some review comments re: doing it right)

the second element needs to be similar AND preserve orientation.

This isn't crazy. It'd be much more productive than the "set tolerance to 0 for orientation-dependent elements" idea.

include/fe/fe.h Outdated
* element we computed on to try to avoid calling
* init_shape_functions and compute_shape_functions
*/
bool caching;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be a static member - even if anyone is doing something as weird as turning caching on for some FE objects and off for others, that means they're deep enough in the weeds that they could figure out how to do that by manipulating the static member.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed this member entirely. I can put it back if you think we should?

src/fe/fe.C Outdated
@@ -47,6 +47,7 @@ namespace libMesh
template <unsigned int Dim, FEFamily T>
FE<Dim,T>::FE (const FEType & fet) :
FEGenericBase<typename FEOutputType<T>::type> (Dim,fet),
caching(!libMesh::on_command_line("--disable-caching")),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And if we make this a static member, could we initialize it with a singleton? I really hate doing any kind of string comparisons for variable identifiers in places where they're going to be hit more than once; I still remember the first time I profiled a hypersonics code that turned out to be spending a plurality of its time in string parsing...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm now (partially) using the Singleton functionality though I actually never create a Singleton object as it looks unnecessary here.

@nmnobre nmnobre force-pushed the cache branch 15 times, most recently from 6444e6e to 3c23bdc Compare April 30, 2025 10:54
@nmnobre
Copy link
Member Author

nmnobre commented Apr 30, 2025

This is ready for another look, it implements what we discussed, it's just a matter of style now I think.

  1. Do we want to initialise a static data member in the FE classes for the caching toggle?
  2. Did I miss any orientation-dependent families in the if conditions? Would we rather define a is_orientation_dependent() method similar to is_hierarchic()?
  3. Do you have a benchmark that benefits decisively from caching just to confirm we haven't lost any performance?

@nmnobre nmnobre force-pushed the cache branch 2 times, most recently from 6682c96 to 96532a6 Compare May 7, 2025 14:17
Copy link
Member

@roystgnr roystgnr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think the cache validity checking is getting to a level of complexity that deserves being broken out into its own method, but IMHO this is still a step in the right direction as-is.

src/fe/fe.C Outdated
// Check if cached edge and face orientations are equal
if constexpr (T == HIERARCHIC || T == L2_HIERARCHIC || T == HIERARCHIC_VEC ||
T == L2_HIERARCHIC_VEC || T == BERNSTEIN || T == SZABAB ||
T == NEDELEC_ONE || T == RAVIART_THOMAS)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we move this logic into a static Elem:: array?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think the cache validity checking is getting to a level of complexity that deserves being broken out into its own method

Done.

Could we move this logic into a static Elem:: array?

Moved to an FEInterface member function instead as suggested in (2) above.

@nmnobre nmnobre force-pushed the cache branch 4 times, most recently from c3292e4 to 7a2389c Compare May 10, 2025 00:16
@nmnobre nmnobre changed the title Add option to disable caching in FE::reinit() Cache more conservatively for orientation-dependent elements and add option to disable caching entirely May 12, 2025
@nmnobre nmnobre changed the title Cache more conservatively for orientation-dependent elements and add option to disable caching entirely Cache carefully for orientation-dependent elems and add option to disable caching entirely May 12, 2025
@nmnobre nmnobre changed the title Cache carefully for orientation-dependent elems and add option to disable caching entirely Add orientation-dependent caching checks and option to disable caching entirely May 12, 2025
@nmnobre nmnobre changed the title Add orientation-dependent caching checks and option to disable caching entirely Add orientation-dependent caching checks and option to disable caching May 12, 2025
@nmnobre
Copy link
Member Author

nmnobre commented May 12, 2025

I've benchmarked Vector FE Ex.4 with
for i in {1..5}; do (time ./example-opt grid_size=20 element_type=HEX20 -pc_type lu > /dev/null) 2>&1 | grep real; done, and saw no performance regression:
Before:

real	0m20.275s
real	0m20.247s
real	0m20.344s
real	0m20.391s
real	0m20.318s

After:

real	0m20.003s
real	0m20.288s
real	0m20.424s
real	0m20.201s
real	0m20.380s

After w/ --disable-caching:

real	0m21.396s
real	0m21.389s
real	0m21.371s
real	0m21.446s
real	0m21.565s

@nmnobre
Copy link
Member Author

nmnobre commented May 12, 2025

I want #4170 to be merged first so I can rebase this then.

{
bool m = cached_nodes.size() == elem->n_nodes();
for (unsigned n = 1; m && n < elem->n_nodes(); n++)
m = (elem->point(n) - elem->point(0)).relative_fuzzy_equals(cached_nodes[n] - cached_nodes[0]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For correctness: shouldn't this be &= rather than =? Likewise in the for loops below?

And if that's right, then, for performance: should we bother tracking m to return it? If only 99% of cached geometry and orientation matches then we can't reuse cached data, so we might as well return false as soon as we see any mismatch or return true at the end otherwise.

And either way, if I'm right and we don't want to merge this without changes: how about matches_cache() for the name?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For correctness: shouldn't this be &= rather than =? Likewise in the for loops below?

We know m is true in the loop bodies, so the assignment is equivalent.

And if that's right, then, for performance: should we bother tracking m to return it? If only 99% of cached geometry and orientation matches then we can't reuse cached data, so we might as well return false as soon as we see any mismatch or return true at the end otherwise.

Yeah, I thought about that, but I really liked the symmetry with the code in the sister method. Besides, the loops short-circuit if m is false. For orientation-independent elements this means the code should be optimal (right?), for orientation-dependent elements, I can check m in the if-statement if you think it's needed.

And either way, if I'm right and we don't want to merge this without changes: how about matches_cache() for the name?

I can change the name anyway. :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, man, I missed the short-circuiting in the for loop parentheses. This is a really weird idiom but yeah, looks correct, and it's basically optimal - if you break out of the first for loop you're still requesting 2 unnecessary tests of m, but I'll bet the optimizer figures that out and gives the same code regardless, and either way you're not doing any unnecessary floating-point.

Damn it, I knew that if I thought I caught you in a correctness error then I must have overlooked something.

Copy link
Member

@roystgnr roystgnr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer the name change but I'm happy either way.

@nmnobre
Copy link
Member Author

nmnobre commented May 15, 2025

I'd prefer the name change but I'm happy either way.

Done.

@roystgnr roystgnr merged commit 19f06e2 into libMesh:devel May 22, 2025
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants